
45 research outputs found

Estimating parameters of a multipartite loglinear graph model via the EM algorithm

Author: Bolla Marianna
Elbanna Ahmed
Publication date: 11/03/2015

We will amalgamate the Rasch model (for rectangular binary tables) and the newly introduced $\alpha$-$\beta$ models (for random undirected graphs) in the framework of a semiparametric probabilistic graph model. Our purpose is to partition the vertices of an observed graph so that the generated subgraphs and bipartite graphs obey these models, while their strongly connected parameters give a multiscale evaluation of the vertices at the same time. In this way, a heterogeneous version of the stochastic block model is built via mixtures of loglinear models, and the parameters are estimated with a special EM iteration. In the context of social networks, the clusters can be identified with social groups and the parameters with the attitudes of people of one group towards people of the other; these attitudes depend on the cluster memberships. The algorithm is applied to randomly generated and real-world data.

arXiv.org e-Print Archive

CiteSeerX

Modularity spectra, eigen-subspaces, and structure of weighted graphs

Author: Bolla Marianna
Publication venue: 'Elsevier BV'
Publication date: 01/12/2017

University of Debrecen Electronic Archive

CLASSIFICATION OF MULTIGRAPHS VIA SPECTRAL TECHNIQUES

Author: BOLLA Marianna
TUSNÁDY Gábor
Publication venue: 'Periodica Polytechnica Budapest University of Technology and Economics'
Publication date: 01/01/1992

Classification problems of the vertices of large multigraphs (hypergraphs or weighted graphs) can be easily handled by means of linear algebraic tools. For this purpose the notion of the Laplacian of multigraphs is introduced; the eigenvectors belonging to k consecutive eigenvalues of the Laplacian define an optimal k-dimensional Euclidean representation of the vertices. In this way perturbation results are obtained for the minimal (k+1)-cuts of multigraphs (where k is an arbitrary integer between 1 and the number of vertices). The (k+1)-variance of the optimal k-dimensional representatives is estimated from above by the k smallest positive eigenvalues and by the gap in the spectrum between the kth and (k+1)th positive eigenvalues in increasing order. These results are of statistical character. However, they are useful and well adapted to automatic computation in the case of large multigraphs, when one is not interested in strict structural properties and the usual enumeration algorithms are very time-demanding.
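The spectral representation described in this abstract can be illustrated with a minimal sketch. It assumes the ordinary combinatorial Laplacian L = D - W of a connected weighted graph, which may differ from the multigraph Laplacian defined in the paper; the function name `laplacian_embedding` and the toy graph are hypothetical and not taken from the source.

```python
import numpy as np

def laplacian_embedding(W, k):
    """Return the k smallest positive Laplacian eigenvalues and their eigenvectors.

    Assumption: W is the symmetric weight matrix of a connected graph, so
    exactly one eigenvalue of L = D - W is zero and it is skipped.
    """
    W = np.asarray(W, dtype=float)
    L = np.diag(W.sum(axis=1)) - W           # combinatorial Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    return eigvals[1:k + 1], eigvecs[:, 1:k + 1]

# Toy graph (hypothetical): two triangles joined by one weak edge.
W = np.zeros((6, 6))
for u, v, w in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[u, v] = W[v, u] = w

vals, X = laplacian_embedding(W, k=1)        # 1-dimensional representatives
labels = (X[:, 0] > 0).astype(int)           # sign split gives a 2-cut
```

With k = 1 the rows of `X` are one-dimensional representatives, and splitting them by sign separates the two triangles, a simple instance of a (k+1)-cut. For larger k one would typically cluster the rows of `X`, for example with k-means.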

Periodica Polytechnica (Budapest University of Technology and Economics)

Percolated stochastic block model via EM algorithm and belief propagation with non-backtracking spectra

Author: Bolla Marianna
Zhou Daniel
Publication venue
Publication date: 20/08/2023

Whereas Laplacian and modularity based spectral clustering is suited to dense graphs, recent results show that for sparse ones the non-backtracking spectrum is the best candidate for finding assortative clusters of nodes. Here belief propagation in the sparse stochastic block model is derived with arbitrary given model parameters, which results in a non-linear system of equations; with linear approximation, the spectrum of the non-backtracking matrix is able to specify the number $k$ of clusters. Then the model parameters themselves can be estimated by the EM algorithm. Bond percolation in the assortative model is considered in the following two senses: the within- and between-cluster edge probabilities decrease with the number of nodes, and edges coming into existence in this way are retained with probability $\beta$. As a consequence, the optimal $k$ is the number of the structural real eigenvalues (greater than $\sqrt{c}$, where $c$ is the average degree) of the non-backtracking matrix of the graph. Assuming these eigenvalues $\mu_1 > \dots > \mu_k$ are distinct, the multiple phase transitions obtained for $\beta$ are $\beta_i = \frac{c}{\mu_i^2}$; further, at $\beta_i$ the number of detectable clusters is $i$, for $i = 1, \dots, k$. Inflation-deflation techniques are also discussed to classify the nodes themselves, which can be the basis of sparse spectral clustering.

Comment: 29 pages, 16 figures
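The eigenvalue count and the thresholds $\beta_i = c/\mu_i^2$ from this abstract can be tried out on a small graph. The following is only a sketch under assumptions: it builds the standard non-backtracking matrix on directed edges, takes $c$ as the empirical average degree, and uses hypothetical function names; it is not the authors' implementation and does not include belief propagation or the EM step.

```python
import numpy as np

def nonbacktracking_matrix(edges):
    """B[(u->v), (v->w)] = 1 whenever w != u, over both orientations of each edge."""
    directed = list(edges) + [(v, u) for u, v in edges]
    index = {e: i for i, e in enumerate(directed)}
    B = np.zeros((len(directed), len(directed)))
    for (u, v), i in index.items():
        for (x, w), j in index.items():
            if x == v and w != u:
                B[i, j] = 1.0
    return B

def structural_eigenvalues(edges, tol=1e-9):
    """Return (c, mus): average degree and the real eigenvalues above sqrt(c), descending."""
    nodes = {u for e in edges for u in e}
    c = 2 * len(edges) / len(nodes)          # empirical average degree (assumption)
    eig = np.linalg.eigvals(nonbacktracking_matrix(edges))
    real = eig.real[np.abs(eig.imag) < tol]  # keep numerically real eigenvalues
    mus = np.sort(real[real > np.sqrt(c) + tol])[::-1]
    return c, mus

def percolation_thresholds(c, mus):
    """beta_i = c / mu_i^2 for each structural eigenvalue mu_i."""
    return c / mus ** 2

# Toy input (hypothetical): the complete graph on 5 nodes.
edges = [(u, v) for u in range(5) for v in range(u + 1, 5)]
c, mus = structural_eigenvalues(edges)
betas = percolation_thresholds(c, mus)
k_hat = len(mus)                             # estimated number of clusters
```

The complete graph has no cluster structure, so only the leading eigenvalue exceeds $\sqrt{c}$ and `k_hat` is 1. The dense double loop is fine for toy graphs; a sparse matrix and an iterative eigensolver would be the natural choice for graphs of realistic size.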

arXiv.org e-Print Archive